Overview

Dataset statistics

Number of variables: 20
Number of observations: 3475226
Missing cells: 2700745
Missing cells (%): 3.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 530.3 MiB
Average record size in memory: 160.0 B
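
The derived figures above follow from the raw counts. A minimal sketch that recomputes them, assuming the percentage is taken over all cells (variables × observations) and that 1 MiB is 2**20 bytes; the function names are illustrative and not part of ydata-profiling:

```python
def missing_cell_pct(missing_cells: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells over all cells, in percent."""
    return 100.0 * missing_cells / (n_vars * n_obs)


def avg_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average bytes per row, assuming 1 MiB = 2**20 bytes."""
    return total_mib * 2**20 / n_obs


pct = missing_cell_pct(2_700_745, 20, 3_475_226)
rec = avg_record_bytes(530.3, 3_475_226)
print(f"{pct:.1f}% missing, {rec:.1f} B per record")
```

Note that 2700745 is exactly 5 × 540149, which matches the five variables reported with 540149 missing values each.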

Variable types

Categorical: 4
DateTime: 2
Numeric: 13
Boolean: 1

Alerts

Airport_fee is highly overall correlated with tolls_amount [High correlation]
RatecodeID is highly overall correlated with improvement_surcharge [High correlation]
VendorID is highly overall correlated with improvement_surcharge [High correlation]
congestion_surcharge is highly overall correlated with improvement_surcharge [High correlation]
fare_amount is highly overall correlated with total_amount and 1 other field [High correlation]
improvement_surcharge is highly overall correlated with RatecodeID and 2 other fields [High correlation]
tip_amount is highly overall correlated with total_amount [High correlation]
tolls_amount is highly overall correlated with Airport_fee [High correlation]
total_amount is highly overall correlated with fare_amount and 2 other fields [High correlation]
trip_distance is highly overall correlated with fare_amount and 1 other field [High correlation]
VendorID is highly imbalanced (62.0%) [Imbalance]
store_and_fwd_flag is highly imbalanced (97.4%) [Imbalance]
improvement_surcharge is highly imbalanced (89.3%) [Imbalance]
congestion_surcharge is highly imbalanced (67.8%) [Imbalance]
passenger_count has 540149 (15.5%) missing values [Missing]
RatecodeID has 540149 (15.5%) missing values [Missing]
store_and_fwd_flag has 540149 (15.5%) missing values [Missing]
congestion_surcharge has 540149 (15.5%) missing values [Missing]
Airport_fee has 540149 (15.5%) missing values [Missing]
trip_distance is highly skewed (γ1 = 260.0643046) [Skewed]
fare_amount is highly skewed (γ1 = 1859.998402) [Skewed]
total_amount is highly skewed (γ1 = 1857.764441) [Skewed]
trip_distance has 90893 (2.6%) zeros [Zeros]
payment_type has 540149 (15.5%) zeros [Zeros]
extra has 1764424 (50.8%) zeros [Zeros]
mta_tax has 38170 (1.1%) zeros [Zeros]
tip_amount has 1118008 (32.2%) zeros [Zeros]
tolls_amount has 3259590 (93.8%) zeros [Zeros]
Airport_fee has 2706446 (77.9%) zeros [Zeros]
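
The imbalance percentages in the alerts are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k is the number of classes. The sketch below is an assumption about how the score is derived, not a copy of the tool's code; it also assumes missing values are excluded before counting. The function name is illustrative.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts taken from the VendorID and store_and_fwd_flag sections of this report.
vendor_id = imbalance_score([2_719_860, 753_671, 1_206, 489])
store_and_fwd = imbalance_score([2_927_431, 7_646])
print(f"VendorID: {vendor_id:.1%}, store_and_fwd_flag: {store_and_fwd:.1%}")
```

A score of 0 means perfectly balanced classes; a score near 1 means a single class dominates.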

Reproduction

Analysis started: 2025-06-04 06:50:28.072454
Analysis finished: 2025-06-04 07:01:52.736556
Duration: 11 minutes and 24.66 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
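
The reported duration can be checked against the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-06-04 06:50:28.072454", FMT)
finished = datetime.strptime("2025-06-04 07:01:52.736556", FMT)

# Split the elapsed seconds into whole minutes and the remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```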

Variables

VendorID
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3475226
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

Most occurring characters

Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  3475226  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

Most occurring scripts

Value   Count    Frequency (%)
Common  3475226  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3475226  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
2      2719860  78.3%
1      753671   21.7%
7      1206     < 0.1%
6      489      < 0.1%

[Variable name missing from this extract]
DateTime

Distinct: 1672077
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 MiB
Minimum: 2024-12-31 20:47:55
Maximum: 2025-02-01 00:00:44
Invalid dates: 0
Invalid dates (%): 0.0%

[Figure: histogram with fixed size bins (bins=50)]

[Variable name missing from this extract]
DateTime

Distinct: 1671993
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 MiB
Minimum: 2024-12-18 07:52:40
Maximum: 2025-02-01 23:44:11
Invalid dates: 0
Invalid dates (%): 0.0%

[Figure: histogram with fixed size bins (bins=50)]

passenger_count
Real number (ℝ)

Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 540149
Missing (%): 15.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.297859
Minimum: 0
Maximum: 9
Zeros: 24656
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.75075028
Coefficient of variation (CV): 0.57845289
Kurtosis: 11.314653
Mean: 1.297859
Median Absolute Deviation (MAD): 0
Skewness: 3.0571207
Sum: 3809316
Variance: 0.56362598
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=10)]
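
Several of the passenger_count statistics above are tied together by simple identities. The sketch below checks three of them, assuming the mean is taken over non-missing rows only; the tolerance allows for the rounding in the reported figures.

```python
import math

# Values copied from the passenger_count statistics above.
n_obs, n_missing = 3_475_226, 540_149
total, mean = 3_809_316, 1.297859
std, variance, cv = 0.75075028, 0.56362598, 0.57845289

n_valid = n_obs - n_missing  # rows that actually carry a passenger count

checks = {
    "mean = sum / n_valid": math.isclose(total / n_valid, mean, rel_tol=1e-5),
    "variance = std ** 2": math.isclose(std**2, variance, rel_tol=1e-5),
    "cv = std / mean": math.isclose(std / mean, cv, rel_tol=1e-5),
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'mismatch'}")
```
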
Common Values

Value      Count    Frequency (%)
1          2322434  66.8%
2          407761   11.7%
3          91409    2.6%
4          59009    1.7%
0          24656    0.7%
5          17786    0.5%
6          12004    0.3%
8          11       < 0.1%
7          4        < 0.1%
9          3        < 0.1%
(Missing)  540149   15.5%

Minimum 10 values

Value  Count    Frequency (%)
0      24656    0.7%
1      2322434  66.8%
2      407761   11.7%
3      91409    2.6%
4      59009    1.7%
5      17786    0.5%
6      12004    0.3%
7      4        < 0.1%
8      11       < 0.1%
9      3        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
9      3        < 0.1%
8      11       < 0.1%
7      4        < 0.1%
6      12004    0.3%
5      17786    0.5%
4      59009    1.7%
3      91409    2.6%
2      407761   11.7%
1      2322434  66.8%
0      24656    0.7%

trip_distance
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 4545
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8551262
Minimum: 0
Maximum: 276423.57
Zeros: 90893
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.39
Q1: 0.98
Median: 1.67
Q3: 3.1
95-th percentile: 11.83
Maximum: 276423.57
Range: 276423.57
Interquartile range (IQR): 2.12

Descriptive statistics

Standard deviation: 564.6016
Coefficient of variation (CV): 96.428596
Kurtosis: 82905.777
Mean: 5.8551262
Median Absolute Deviation (MAD): 0.86
Skewness: 260.0643
Sum: 20347887
Variance: 318774.97
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value                Count    Frequency (%)
0                    90893    2.6%
0.9                  43590    1.3%
0.8                  42866    1.2%
1                    42710    1.2%
1.1                  41354    1.2%
1.2                  39968    1.2%
0.7                  39967    1.2%
1.3                  37724    1.1%
1.4                  35433    1.0%
0.6                  35175    1.0%
Other values (4535)  3025546  87.1%

Minimum 10 values

Value  Count  Frequency (%)
0      90893  2.6%
0.01   11118  0.3%
0.02   3546   0.1%
0.03   1603   < 0.1%
0.04   1252   < 0.1%
0.05   1106   < 0.1%
0.06   926    < 0.1%
0.07   820    < 0.1%
0.08   727    < 0.1%
0.09   630    < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
276423.57  1      < 0.1%
276099.95  1      < 0.1%
222167.49  1      < 0.1%
206137.99  1      < 0.1%
202771.63  1      < 0.1%
189687.43  1      < 0.1%
181139.99  1      < 0.1%
168079.57  1      < 0.1%
167452.94  1      < 0.1%
164959.95  1      < 0.1%
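
The skewness of trip_distance is driven by a handful of extreme values: the maximum is 276423.57 while the 95-th percentile is 11.83. One common screening rule, which the report itself does not apply, is Tukey's fences at 1.5 × IQR beyond the quartiles. A minimal sketch using the quartiles reported above; the function name and the choice of k are assumptions:

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported for trip_distance above.
lower, upper = tukey_fences(q1=0.98, q3=3.1)
print(f"flag trips outside [{lower:.2f}, {upper:.2f}]")
```

Because the 95-th percentile already exceeds the upper fence, this rule would flag more than 5% of trips, so a larger k or a domain-specific cap may be more appropriate.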

RatecodeID
Real number (ℝ)

High correlation  Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 540149
Missing (%): 15.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4825345
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 99
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 11.632772
Coefficient of variation (CV): 4.685845
Kurtosis: 64.754873
Mean: 2.4825345
Median Absolute Deviation (MAD): 0
Skewness: 8.164267
Sum: 7286430
Variance: 135.32138
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=7)]
Common Values

Value      Count    Frequency (%)
1          2756472  79.3%
2          94420    2.7%
99         41963    1.2%
5          26501    0.8%
3          8622     0.2%
4          7092     0.2%
6          7        < 0.1%
(Missing)  540149   15.5%

Minimum 10 values

Value  Count    Frequency (%)
1      2756472  79.3%
2      94420    2.7%
3      8622     0.2%
4      7092     0.2%
5      26501    0.8%
6      7        < 0.1%
99     41963    1.2%

Maximum 10 values

Value  Count    Frequency (%)
99     41963    1.2%
6      7        < 0.1%
5      26501    0.8%
4      7092     0.2%
3      8622     0.2%
2      94420    2.7%
1      2756472  79.3%

store_and_fwd_flag
Boolean

Imbalance  Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 540149
Missing (%): 15.5%
Memory size: 6.6 MiB

Value      Count    Frequency (%)
False      2927431  84.2%
True       7646     0.2%
(Missing)  540149   15.5%

PULocationID
Real number (ℝ)

Distinct: 261
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165.19158
Minimum: 1
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 48
Q1: 132
Median: 162
Q3: 234
95-th percentile: 249
Maximum: 265
Range: 264
Interquartile range (IQR): 102

Descriptive statistics

Standard deviation: 64.529483
Coefficient of variation (CV): 0.39063422
Kurtosis: -0.83359167
Mean: 165.19158
Median Absolute Deviation (MAD): 62
Skewness: -0.28915174
Sum: 5.7407806 × 10^8
Variance: 4164.0541
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value               Count    Frequency (%)
161                 169977   4.9%
237                 163703   4.7%
236                 155647   4.5%
132                 146137   4.2%
230                 125829   3.6%
186                 119131   3.4%
162                 117930   3.4%
142                 110585   3.2%
239                 96614    2.8%
163                 95906    2.8%
Other values (251)  2173767  62.6%

Minimum 10 values

Value  Count  Frequency (%)
1      377    < 0.1%
2      6      < 0.1%
3      175    < 0.1%
4      7482   0.2%
5      3      < 0.1%
6      87     < 0.1%
7      3192   0.1%
8      22     < 0.1%
9      117    < 0.1%
10     1329   < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
265    1380   < 0.1%
264    8141   0.2%
263    67409  1.9%
262    49609  1.4%
261    16651  0.5%
260    1476   < 0.1%
259    299    < 0.1%
258    708    < 0.1%
257    285    < 0.1%
256    2353   0.1%

DOLocationID
Real number (ℝ)

Distinct: 260
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.12518
Minimum: 1
Maximum: 265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 43
Q1: 113
Median: 162
Q3: 234
95-th percentile: 257
Maximum: 265
Range: 264
Interquartile range (IQR): 121

Descriptive statistics

Standard deviation: 69.401686
Coefficient of variation (CV): 0.42285826
Kurtosis: -0.9349258
Mean: 164.12518
Median Absolute Deviation (MAD): 68
Skewness: -0.35704638
Sum: 5.7037208 × 10^8
Variance: 4816.5941
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value               Count    Frequency (%)
236                 161376   4.6%
237                 149970   4.3%
161                 131258   3.8%
230                 108177   3.1%
170                 100060   2.9%
142                 98982    2.8%
239                 97559    2.8%
162                 93798    2.7%
141                 92675    2.7%
68                  89232    2.6%
Other values (250)  2352139  67.7%

Minimum 10 values

Value  Count  Frequency (%)
1      6873   0.2%
2      4      < 0.1%
3      312    < 0.1%
4      15012  0.4%
5      8      < 0.1%
6      89     < 0.1%
7      7376   0.2%
8      37     < 0.1%
9      346    < 0.1%
10     3365   0.1%

Maximum 10 values

Value  Count  Frequency (%)
265    12086  0.3%
264    11976  0.3%
263    73889  2.1%
262    54149  1.6%
261    16521  0.5%
260    2464   0.1%
259    414    < 0.1%
258    1172   < 0.1%
257    1323   < 0.1%
256    6399   0.2%

payment_type
Real number (ℝ)

Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0366229
Minimum: 0
Maximum: 5
Zeros: 540149
Zeros (%): 15.5%
Negative: 0
Negative (%): 0.0%
Memory size: 26.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.70133341
Coefficient of variation (CV): 0.67655594
Kurtosis: 5.5771896
Mean: 1.0366229
Median Absolute Deviation (MAD): 0
Skewness: 1.5995141
Sum: 3602499
Variance: 0.49186855
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
Common Values

Value  Count    Frequency (%)
1      2444393  70.3%
0      540149   15.5%
2      390429   11.2%
4      76481    2.2%
3      23773    0.7%
5      1        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      540149   15.5%
1      2444393  70.3%
2      390429   11.2%
3      23773    0.7%
4      76481    2.2%
5      1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
5      1        < 0.1%
4      76481    2.2%
3      23773    0.7%
2      390429   11.2%
1      2444393  70.3%
0      540149   15.5%
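
The 540149 zeros in payment_type equal the missing-value count reported for passenger_count, RatecodeID, store_and_fwd_flag, congestion_surcharge and Airport_fee. Matching counts suggest, but do not prove, that these are the same rows. A row-level check could look like the sketch below; the helper name and the toy rows are hypothetical and are not taken from the profiled dataset.

```python
def co_missing_share(rows, flag_col, flag_value, other_cols):
    """Among rows where flag_col == flag_value, return the share whose other_cols are all None."""
    flagged = [r for r in rows if r[flag_col] == flag_value]
    if not flagged:
        return 0.0
    hits = sum(all(r[c] is None for c in other_cols) for r in flagged)
    return hits / len(flagged)


# Hypothetical toy rows for illustration only.
toy_rows = [
    {"payment_type": 0, "passenger_count": None, "RatecodeID": None},
    {"payment_type": 1, "passenger_count": 1.0, "RatecodeID": 1.0},
    {"payment_type": 0, "passenger_count": None, "RatecodeID": None},
    {"payment_type": 2, "passenger_count": 2.0, "RatecodeID": 1.0},
]
share = co_missing_share(toy_rows, "payment_type", 0, ["passenger_count", "RatecodeID"])
print(share)
```

A share of 1.0 on the real data would confirm that payment_type 0 marks exactly the records with the missing fields.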

fare_amount
Real number (ℝ)

High correlation  Skewed 

Distinct: 11538
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.081803
Minimum: -900
Maximum: 863372.12
Zeros: 1398
Zeros (%): < 0.1%
Negative: 144118
Negative (%): 4.1%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -900
5-th percentile: 3.7
Q1: 8.6
Median: 12.11
Q3: 19.5
95-th percentile: 52
Maximum: 863372.12
Range: 864272.12
Interquartile range (IQR): 10.9

Descriptive statistics

Standard deviation: 463.47292
Coefficient of variation (CV): 27.132553
Kurtosis: 3464796.3
Mean: 17.081803
Median Absolute Deviation (MAD): 4.89
Skewness: 1859.9984
Sum: 59363125
Variance: 214807.15
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value                 Count    Frequency (%)
8.6                   150277   4.3%
7.9                   150262   4.3%
9.3                   147090   4.2%
7.2                   144202   4.1%
10                    142048   4.1%
10.7                  133306   3.8%
6.5                   129846   3.7%
11.4                  125316   3.6%
12.1                  116814   3.4%
12.8                  106638   3.1%
Other values (11528)  2129427  61.3%

Minimum 10 values

Value   Count  Frequency (%)
-900    1      < 0.1%
-850    1      < 0.1%
-826.2  1      < 0.1%
-700    5      < 0.1%
-634.4  1      < 0.1%
-600    2      < 0.1%
-595.2  1      < 0.1%
-579.8  1      < 0.1%
-550    1      < 0.1%
-541.3  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
863372.12  1      < 0.1%
2450.9     1      < 0.1%
1309.2     1      < 0.1%
950        3      < 0.1%
936.8      1      < 0.1%
900        2      < 0.1%
899.99     2      < 0.1%
893.75     1      < 0.1%
850        1      < 0.1%
826.2      1      < 0.1%

extra
Real number (ℝ)

Zeros 

Distinct: 77
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3177367
Minimum: -7.5
Maximum: 15
Zeros: 1764424
Zeros (%): 50.8%
Negative: 29596
Negative (%): 0.9%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -7.5
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2.5
95-th percentile: 5
Maximum: 15
Range: 22.5
Interquartile range (IQR): 2.5

Descriptive statistics

Standard deviation: 1.8615087
Coefficient of variation (CV): 1.412656
Kurtosis: 2.8436517
Mean: 1.3177367
Median Absolute Deviation (MAD): 0
Skewness: 1.4616594
Sum: 4579432.8
Variance: 3.4652146
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value              Count    Frequency (%)
0                  1764424  50.8%
1                  537935   15.5%
2.5                532712   15.3%
3.25               200468   5.8%
5                  109568   3.2%
4.25               109078   3.1%
5.75               80765    2.3%
3.5                34386    1.0%
6                  22872    0.7%
7.5                16872    0.5%
Other values (67)  66146    1.9%

Minimum 10 values

Value  Count  Frequency (%)
-7.5   460    < 0.1%
-6     779    < 0.1%
-5.75  2      < 0.1%
-5.25  1      < 0.1%
-5     2322   0.1%
-4.25  2      < 0.1%
-3.25  7      < 0.1%
-2.5   9982   0.3%
-2     1      < 0.1%
-1.75  4      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
15     1      < 0.1%
14.25  3      < 0.1%
13.25  1      < 0.1%
12.5   1399   < 0.1%
11.75  736    < 0.1%
11.5   1      < 0.1%
11     1916   0.1%
10.75  499    < 0.1%
10.25  1184   < 0.1%
10     6414   0.2%

mta_tax
Real number (ℝ)

Zeros 

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47809905
Minimum: -0.5
Maximum: 10.5
Zeros: 38170
Zeros (%): 1.1%
Negative: 57140
Negative (%): 1.6%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -0.5
5-th percentile: 0.5
Q1: 0.5
Median: 0.5
Q3: 0.5
95-th percentile: 0.5
Maximum: 10.5
Range: 11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.13746227
Coefficient of variation (CV): 0.28751838
Kurtosis: 83.704664
Mean: 0.47809905
Median Absolute Deviation (MAD): 0
Skewness: -5.7552602
Sum: 1661502.2
Variance: 0.018895874
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]
Common Values

Value  Count    Frequency (%)
0.5    3379839  97.3%
-0.5   57140    1.6%
0      38170    1.1%
1      64       < 0.1%
10.5   5        < 0.1%
4.75   3        < 0.1%
4      2        < 0.1%
3.75   2        < 0.1%
6.5    1        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
-0.5   57140    1.6%
0      38170    1.1%
0.5    3379839  97.3%
1      64       < 0.1%
3.75   2        < 0.1%
4      2        < 0.1%
4.75   3        < 0.1%
6.5    1        < 0.1%
10.5   5        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
10.5   5        < 0.1%
6.5    1        < 0.1%
4.75   3        < 0.1%
4      2        < 0.1%
3.75   2        < 0.1%
1      64       < 0.1%
0.5    3379839  97.3%
0      38170    1.1%
-0.5   57140    1.6%

tip_amount
Real number (ℝ)

High correlation  Zeros 

Distinct: 4197
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9598128
Minimum: -86
Maximum: 400
Zeros: 1118008
Zeros (%): 32.2%
Negative: 124
Negative (%): < 0.1%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -86
5-th percentile: 0
Q1: 0
Median: 2.45
Q3: 3.93
95-th percentile: 10
Maximum: 400
Range: 486
Interquartile range (IQR): 3.93

Descriptive statistics

Standard deviation: 3.7796812
Coefficient of variation (CV): 1.2770001
Kurtosis: 178.95533
Mean: 2.9598128
Median Absolute Deviation (MAD): 2.25
Skewness: 5.3446306
Sum: 10286018
Variance: 14.28599
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value                Count    Frequency (%)
0                    1118008  32.2%
2                    156750   4.5%
1                    120556   3.5%
3                    67562    1.9%
5                    39398    1.1%
1.5                  31038    0.9%
4                    29170    0.8%
2.95                 25146    0.7%
2.8                  22460    0.6%
3.15                 22112    0.6%
Other values (4187)  1843026  53.0%

Minimum 10 values

Value   Count  Frequency (%)
-86     1      < 0.1%
-70     1      < 0.1%
-52.45  1      < 0.1%
-50.05  1      < 0.1%
-33.66  1      < 0.1%
-25     1      < 0.1%
-19.61  1      < 0.1%
-18.84  1      < 0.1%
-17     1      < 0.1%
-16.34  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
400     1      < 0.1%
360     1      < 0.1%
333.33  1      < 0.1%
333.3   1      < 0.1%
303     1      < 0.1%
285     1      < 0.1%
261     1      < 0.1%
228     1      < 0.1%
225.05  1      < 0.1%
220     1      < 0.1%

tolls_amount
Real number (ℝ)

High correlation  Zeros 

Distinct: 1234
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4493081
Minimum: -126.94
Maximum: 170.94
Zeros: 3259590
Zeros (%): 93.8%
Negative: 4559
Negative (%): 0.1%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -126.94
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 6.94
Maximum: 170.94
Range: 297.88
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.0025818
Coefficient of variation (CV): 4.4570347
Kurtosis: 88.950625
Mean: 0.4493081
Median Absolute Deviation (MAD): 0
Skewness: 5.440691
Sum: 1561447.2
Variance: 4.0103339
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value                Count    Frequency (%)
0                    3259590  93.8%
6.94                 192386   5.5%
-6.94                3738     0.1%
14.06                2249     0.1%
3.18                 1689     < 0.1%
16.06                1607     < 0.1%
13.88                1263     < 0.1%
13.38                676      < 0.1%
15.38                535      < 0.1%
11.19                520      < 0.1%
Other values (1224)  10973    0.3%

Minimum 10 values

Value    Count  Frequency (%)
-126.94  1      < 0.1%
-96.94   1      < 0.1%
-82.69   1      < 0.1%
-74.76   1      < 0.1%
-48.28   1      < 0.1%
-48.18   1      < 0.1%
-47.04   1      < 0.1%
-45.94   1      < 0.1%
-44.94   1      < 0.1%
-44.88   1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
170.94  1      < 0.1%
126.94  1      < 0.1%
123     1      < 0.1%
105.88  1      < 0.1%
96.94   1      < 0.1%
95      1      < 0.1%
84      1      < 0.1%
82.69   1      < 0.1%
81      1      < 0.1%
80      2      < 0.1%

improvement_surcharge
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 MiB

Length

Max length: 4
Median length: 3
Mean length: 3.0171298
Min length: 3

Characters and Unicode

Total characters: 10485208
Distinct characters: 5
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value  Count    Frequency (%)
1.0    3377509  97.2%
-1.0   59530    1.7%
0.0    37694    1.1%
0.3    493      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value  Count    Frequency (%)
1.0    3437039  98.9%
0.0    37694    1.1%
0.3    493      < 0.1%
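
In the frequency table that follows the Common Values plot, 1.0 is counted 3437039 times, which equals the 3377509 rows holding 1.0 plus the 59530 rows holding -1.0. That is what one would get if tokens were stripped of leading and trailing punctuation before counting. The sketch below reproduces the merged count under that assumption; it is not a copy of the tool's implementation.

```python
import string
from collections import Counter

# Common Values counts reported for improvement_surcharge.
value_counts = {"1.0": 3_377_509, "-1.0": 59_530, "0.0": 37_694, "0.3": 493}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumption: tokens lose leading and trailing punctuation, so "-1.0" becomes "1.0".
    word_counts[value.strip(string.punctuation)] += count

total = sum(value_counts.values())
for word, count in word_counts.most_common():
    print(word, count, f"{count / total:.1%}")
```

The practical consequence is that the sign of a value is invisible in that table, so negative surcharges should be read from Common Values instead.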

Most occurring characters

Value  Count    Frequency (%)
0      3512920  33.5%
.      3475226  33.1%
1      3437039  32.8%
-      59530    0.6%
3      493      < 0.1%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     6950452  66.3%
Other Punctuation  3475226  33.1%
Dash Punctuation   59530    0.6%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      3512920  50.5%
1      3437039  49.5%
3      493      < 0.1%

Other Punctuation
Value  Count    Frequency (%)
.      3475226  100.0%

Dash Punctuation
Value  Count    Frequency (%)
-      59530    100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  10485208  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      3512920  33.5%
.      3475226  33.1%
1      3437039  32.8%
-      59530    0.6%
3      493      < 0.1%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  10485208  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      3512920  33.5%
.      3475226  33.1%
1      3437039  32.8%
-      59530    0.6%
3      493      < 0.1%

total_amount
Real number (ℝ)

High correlation  Skewed 

Distinct: 21995
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.611292
Minimum: -901
Maximum: 863380.37
Zeros: 559
Zeros (%): < 0.1%
Negative: 63037
Negative (%): 1.8%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -901
5-th percentile: 8.75
Q1: 15.2
Median: 19.95
Q3: 27.78
95-th percentile: 74
Maximum: 863380.37
Range: 864281.37
Interquartile range (IQR): 12.58

Descriptive statistics

Standard deviation: 463.65848
Coefficient of variation (CV): 18.103674
Kurtosis: 3459248.6
Mean: 25.611292
Median Absolute Deviation (MAD): 5.61
Skewness: 1857.7644
Sum: 89005027
Variance: 214979.18
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Common Values

Value                 Count    Frequency (%)
17.7                  23391    0.7%
12.6                  21030    0.6%
21.9                  18725    0.5%
13.5                  18309    0.5%
16.86                 18087    0.5%
19.38                 17651    0.5%
16.02                 17628    0.5%
18.54                 17469    0.5%
21.06                 17186    0.5%
15.18                 17089    0.5%
Other values (21985)  3288661  94.6%

Minimum 10 values

Value    Count  Frequency (%)
-901     1      < 0.1%
-865.39  1      < 0.1%
-851     1      < 0.1%
-704.25  1      < 0.1%
-701     3      < 0.1%
-652.75  1      < 0.1%
-633.21  1      < 0.1%
-616.36  1      < 0.1%
-607.75  1      < 0.1%
-601     1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
863380.37  1      < 0.1%
2506.71    1      < 0.1%
1311.7     1      < 0.1%
969.05     1      < 0.1%
953.5      1      < 0.1%
951        2      < 0.1%
903.5      1      < 0.1%
903.49     2      < 0.1%
901        1      < 0.1%
896.5      1      < 0.1%

congestion_surcharge
Categorical

High correlation  Imbalance  Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 540149
Missing (%): 15.5%
Memory size: 26.5 MiB

Length

Max length: 4
Median length: 3
Mean length: 3.0164633
Min length: 3

Characters and Unicode

Total characters: 8853552
Distinct characters: 5
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.5
2nd row: 2.5
3rd row: 2.5
4th row: 0.0
5th row: 0.0

Common Values

Value      Count    Frequency (%)
2.5        2660818  76.6%
0.0        225938   6.5%
-2.5       48321    1.4%
(Missing)  540149   15.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value  Count    Frequency (%)
2.5    2709139  92.3%
0.0    225938   7.7%

Most occurring characters

Value  Count    Frequency (%)
.      2935077  33.2%
2      2709139  30.6%
5      2709139  30.6%
0      451876   5.1%
-      48321    0.5%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     5870154  66.3%
Other Punctuation  2935077  33.2%
Dash Punctuation   48321    0.5%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      2709139  46.2%
5      2709139  46.2%
0      451876   7.7%

Other Punctuation
Value  Count    Frequency (%)
.      2935077  100.0%

Dash Punctuation
Value  Count    Frequency (%)
-      48321    100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  8853552  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
.      2935077  33.2%
2      2709139  30.6%
5      2709139  30.6%
0      451876   5.1%
-      48321    0.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  8853552  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
.      2935077  33.2%
2      2709139  30.6%
5      2709139  30.6%
0      451876   5.1%
-      48321    0.5%

Airport_fee
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 7
Distinct (%): < 0.1%
Missing: 540149
Missing (%): 15.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12391106
Minimum: -1.75
Maximum: 6.75
Zeros: 2706446
Zeros (%): 77.9%
Negative: 10411
Negative (%): 0.3%
Memory size: 26.5 MiB

Quantile statistics

Minimum: -1.75
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1.75
Maximum: 6.75
Range: 8.5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.47250898
Coefficient of variation (CV): 3.8132914
Kurtosis: 8.3496097
Mean: 0.12391106
Median Absolute Deviation (MAD): 0
Skewness: 2.7957272
Sum: 363688.5
Variance: 0.22326474
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=7)]
Common Values

Value      Count    Frequency (%)
0          2706446  77.9%
1.75       218203   6.3%
-1.75      10411    0.3%
1.25       8        < 0.1%
5          7        < 0.1%
0.75       1        < 0.1%
6.75       1        < 0.1%
(Missing)  540149   15.5%

Minimum 10 values

Value  Count    Frequency (%)
-1.75  10411    0.3%
0      2706446  77.9%
0.75   1        < 0.1%
1.25   8        < 0.1%
1.75   218203   6.3%
5      7        < 0.1%
6.75   1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
6.75   1        < 0.1%
5      7        < 0.1%
1.75   218203   6.3%
1.25   8        < 0.1%
0.75   1        < 0.1%
0      2706446  77.9%
-1.75  10411    0.3%

[Variable name missing from this extract]
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.5 MiB

Length

Max length: 5
Median length: 4
Mean length: 3.6502026
Min length: 3

Characters and Unicode

Total characters: 12685279
Distinct characters: 5
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count    Frequency (%)
0.75   2246495  64.6%
0.0    1222178  35.2%
-0.75  6553     0.2%
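
The Length and character totals reported for this variable can be recomputed from the Common Values counts above, since each value contributes its string length once per row. A short sketch, assuming the values are stored as the strings shown:

```python
# Common Values counts for this variable, copied from the table above.
value_counts = {"0.75": 2_246_495, "0.0": 1_222_178, "-0.75": 6_553}

# Each row contributes len(value) characters.
total_chars = sum(len(value) * count for value, count in value_counts.items())
n_rows = sum(value_counts.values())
mean_length = total_chars / n_rows
print(total_chars, round(mean_length, 7))
```

The result matches the reported total of 12685279 characters and mean length of 3.6502026.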

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value  Count    Frequency (%)
0.75   2253048  64.8%
0.0    1222178  35.2%

Most occurring characters

Value  Count    Frequency (%)
0      4697404  37.0%
.      3475226  27.4%
7      2253048  17.8%
5      2253048  17.8%
-      6553     0.1%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     9203500  72.6%
Other Punctuation  3475226  27.4%
Dash Punctuation   6553     0.1%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
0      4697404  51.0%
7      2253048  24.5%
5      2253048  24.5%

Other Punctuation
Value  Count    Frequency (%)
.      3475226  100.0%

Dash Punctuation
Value  Count    Frequency (%)
-      6553     100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  12685279  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
0      4697404  37.0%
.      3475226  27.4%
7      2253048  17.8%
5      2253048  17.8%
-      6553     0.1%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  12685279  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
0      4697404  37.0%
.      3475226  27.4%
7      2253048  17.8%
5      2253048  17.8%
-      6553     0.1%

Interactions

[Figures: interaction plots, not recoverable from this extract]
2025-06-04T12:31:07.839066image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:00.117791image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:20.834384image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:45.989885image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:12.757305image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:42.348832image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:14.321902image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:46.785149image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:14.143935image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:43.042199image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:13.041067image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:34.830940image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:52.510071image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:08.872594image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:01.673757image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:22.906123image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:47.913650image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:15.243773image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:45.102805image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:17.008352image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:49.183967image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:16.793994image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:45.114996image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:15.111903image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:36.220251image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:53.862198image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:09.846457image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:02.935946image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:25.084048image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:49.499507image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:17.924189image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:47.769740image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:19.403714image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:51.478060image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:18.908804image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:47.498476image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:17.622225image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:37.795435image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:55.103779image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:10.844110image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:04.968095image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:27.673882image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:51.040534image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:20.068154image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:50.595788image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:21.683934image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:53.973699image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:21.133815image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:49.835118image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:19.906312image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:39.444040image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:56.192598image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:12.033878image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:06.545098image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:31.189380image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:53.173938image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:22.358392image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:53.109098image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:24.050078image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:56.166707image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:23.543839image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:52.034132image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:21.281946image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:40.919896image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:57.284196image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:13.498817image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:08.402285image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:32.821861image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:55.484053image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:24.533721image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:55.614015image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:27.084066image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:58.594028image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:25.254118image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:54.173987image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:23.279552image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:42.643883image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:58.378134image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:14.491308image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:10.136523image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:34.033006image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:57.730769image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:26.810934image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:58.187629image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:29.397000image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:00.655096image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:27.204081image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:56.369132image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:25.561105image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:44.229912image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:59.772359image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:15.483804image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:11.059573image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:35.465335image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:26:59.498832image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:27:28.797632image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:00.289152image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:28:31.353979image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:02.208264image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:29.263956image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:29:58.264068image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:27.379043image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:30:45.519681image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/
2025-06-04T12:31:00.774000image/svg+xmlMatplotlib v3.9.2, https://matplotlib.org/

Correlations

variable | Airport_fee | DOLocationID | PULocationID | RatecodeID | VendorID | cbd_congestion_fee | congestion_surcharge | extra | fare_amount | improvement_surcharge | mta_tax | passenger_count | payment_type | store_and_fwd_flag | tip_amount | tolls_amount | total_amount | trip_distance
Airport_fee | 1.000 | -0.043 | -0.222 | 0.300 | 0.086 | 0.119 | 0.303 | 0.190 | 0.418 | 0.295 | 0.063 | 0.031 | -0.004 | 0.004 | 0.230 | 0.534 | 0.419 | 0.382
DOLocationID | -0.043 | 1.000 | 0.085 | -0.065 | 0.010 | 0.140 | 0.148 | 0.022 | -0.092 | 0.044 | 0.026 | -0.007 | 0.025 | 0.010 | 0.022 | -0.044 | -0.076 | -0.096
PULocationID | -0.222 | 0.085 | 1.000 | -0.143 | 0.022 | 0.177 | 0.202 | -0.010 | -0.129 | 0.059 | 0.028 | -0.017 | 0.016 | 0.005 | -0.005 | -0.117 | -0.119 | -0.143
RatecodeID | 0.300 | -0.065 | -0.143 | 1.000 | 0.221 | 0.158 | 0.415 | -0.128 | 0.348 | 0.922 | -0.274 | 0.047 | -0.000 | 0.002 | 0.050 | 0.454 | 0.320 | 0.266
VendorID | 0.086 | 0.010 | 0.022 | 0.221 | 1.000 | 0.030 | 0.074 | 0.329 | 0.000 | 0.586 | 0.005 | 0.136 | 0.052 | 0.078 | 0.013 | 0.005 | 0.000 | 0.000
cbd_congestion_fee | 0.119 | 0.140 | 0.177 | 0.158 | 0.030 | 1.000 | 0.376 | 0.206 | 0.000 | 0.277 | 0.001 | 0.020 | 0.172 | 0.003 | 0.021 | 0.024 | 0.000 | 0.000
congestion_surcharge | 0.303 | 0.148 | 0.202 | 0.415 | 0.074 | 0.376 | 1.000 | 0.448 | 0.002 | 0.696 | 0.003 | 0.027 | 0.388 | 0.007 | 0.099 | 0.082 | 0.002 | 0.003
extra | 0.190 | 0.022 | -0.010 | -0.128 | 0.329 | 0.206 | 0.448 | 1.000 | 0.068 | 0.403 | 0.164 | -0.058 | 0.226 | 0.049 | 0.301 | 0.142 | 0.196 | 0.058
fare_amount | 0.418 | -0.092 | -0.129 | 0.348 | 0.000 | 0.000 | 0.002 | 0.068 | 1.000 | 0.000 | 0.096 | 0.046 | -0.080 | 0.000 | 0.358 | 0.386 | 0.956 | 0.790
improvement_surcharge | 0.295 | 0.044 | 0.059 | 0.922 | 0.586 | 0.277 | 0.696 | 0.403 | 0.000 | 1.000 | 0.025 | 0.045 | 0.344 | 0.009 | 0.020 | 0.063 | 0.000 | 0.005
mta_tax | 0.063 | 0.026 | 0.028 | -0.274 | 0.005 | 0.001 | 0.003 | 0.164 | 0.096 | 0.025 | 1.000 | -0.042 | -0.237 | 0.000 | 0.092 | -0.030 | 0.101 | 0.009
passenger_count | 0.031 | -0.007 | -0.017 | 0.047 | 0.136 | 0.020 | 0.027 | -0.058 | 0.046 | 0.045 | -0.042 | 1.000 | 0.034 | 0.035 | 0.012 | 0.040 | 0.043 | 0.037
payment_type | -0.004 | 0.025 | 0.016 | -0.000 | 0.052 | 0.172 | 0.388 | 0.226 | -0.080 | 0.344 | -0.237 | 0.034 | 1.000 | 0.008 | -0.016 | 0.022 | -0.058 | -0.070
store_and_fwd_flag | 0.004 | 0.010 | 0.005 | 0.002 | 0.078 | 0.003 | 0.007 | 0.049 | 0.000 | 0.009 | 0.000 | 0.035 | 0.008 | 1.000 | 0.003 | 0.000 | 0.000 | 0.000
tip_amount | 0.230 | 0.022 | -0.005 | 0.050 | 0.013 | 0.021 | 0.099 | 0.301 | 0.358 | 0.020 | 0.092 | 0.012 | -0.016 | 0.003 | 1.000 | 0.229 | 0.532 | 0.291
tolls_amount | 0.534 | -0.044 | -0.117 | 0.454 | 0.005 | 0.024 | 0.082 | 0.142 | 0.386 | 0.063 | -0.030 | 0.040 | 0.022 | 0.000 | 0.229 | 1.000 | 0.397 | 0.365
total_amount | 0.419 | -0.076 | -0.119 | 0.320 | 0.000 | 0.000 | 0.002 | 0.196 | 0.956 | 0.000 | 0.101 | 0.043 | -0.058 | 0.000 | 0.532 | 0.397 | 1.000 | 0.764
trip_distance | 0.382 | -0.096 | -0.143 | 0.266 | 0.000 | 0.000 | 0.003 | 0.058 | 0.790 | 0.005 | 0.009 | 0.037 | -0.070 | 0.000 | 0.291 | 0.365 | 0.764 | 1.000
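
To make the strongest relationships in the correlation matrix easier to see, the following minimal sketch filters a handful of its entries by absolute value. The pair values are copied from the matrix; the 0.5 cut-off and the helper name strong_pairs are assumptions made for illustration, not settings confirmed by the report.

```python
# A few entries copied from the correlation matrix (not the full table).
corr = {
    ("fare_amount", "total_amount"): 0.956,
    ("RatecodeID", "improvement_surcharge"): 0.922,
    ("fare_amount", "trip_distance"): 0.790,
    ("total_amount", "trip_distance"): 0.764,
    ("Airport_fee", "tolls_amount"): 0.534,
    ("tip_amount", "total_amount"): 0.532,
    ("DOLocationID", "PULocationID"): 0.085,
    ("mta_tax", "payment_type"): -0.237,
}


def strong_pairs(corr, threshold=0.5):
    """Return (pair, value) entries with |value| >= threshold, strongest first.

    The default threshold of 0.5 is an assumed cut-off for this sketch.
    """
    hits = [(pair, r) for pair, r in corr.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


flagged = strong_pairs(corr)
for (left, right), r in flagged:
    print(f"{left} ~ {right}: {r:.3f}")
```

Filtering on the absolute value keeps strong negative relationships as well as positive ones; none of the negative entries in this subset reach the assumed cut-off.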

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
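
As a rough illustration of nullity correlation, the sketch below computes the Pearson correlation between the missing-value indicators of two columns using only the standard library. The function name nullity_correlation and the four rows are hypothetical, chosen to resemble the dataset's columns; this is not the report's own implementation. With pandas, a comparable result is commonly obtained with df.isna().corr().

```python
import math


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if row[col_a] is None else 0.0 for row in rows]
    b = [1.0 if row[col_b] is None else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return math.nan
    return cov / math.sqrt(var_a * var_b)


# Hypothetical rows; None stands for a missing value.
rows = [
    {"passenger_count": 1.0, "RatecodeID": 1.0, "trip_distance": 1.60},
    {"passenger_count": 3.0, "RatecodeID": 1.0, "trip_distance": 0.52},
    {"passenger_count": None, "RatecodeID": None, "trip_distance": 1.19},
    {"passenger_count": None, "RatecodeID": None, "trip_distance": 1.34},
]

together = nullity_correlation(rows, "passenger_count", "RatecodeID")
undefined = nullity_correlation(rows, "passenger_count", "trip_distance")
print(together, undefined)
```

Columns that are always missing in the same rows score 1, while a column with no missing values cannot be correlated with anything and yields NaN here.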

Sample

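index | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | Airport_fee | cbd_congestion_fee
0 | 1 | 2025-01-01 00:18:38 | 2025-01-01 00:26:59 | 1.0 | 1.60 | 1.0 | N | 229 | 237 | 1 | 10.0 | 3.5 | 0.5 | 3.00 | 0.0 | 1.0 | 18.00 | 2.5 | 0.0 | 0.0
1 | 1 | 2025-01-01 00:32:40 | 2025-01-01 00:35:13 | 1.0 | 0.50 | 1.0 | N | 236 | 237 | 1 | 5.1 | 3.5 | 0.5 | 2.02 | 0.0 | 1.0 | 12.12 | 2.5 | 0.0 | 0.0
2 | 1 | 2025-01-01 00:44:04 | 2025-01-01 00:46:01 | 1.0 | 0.60 | 1.0 | N | 141 | 141 | 1 | 5.1 | 3.5 | 0.5 | 2.00 | 0.0 | 1.0 | 12.10 | 2.5 | 0.0 | 0.0
3 | 2 | 2025-01-01 00:14:27 | 2025-01-01 00:20:01 | 3.0 | 0.52 | 1.0 | N | 244 | 244 | 2 | 7.2 | 1.0 | 0.5 | 0.00 | 0.0 | 1.0 | 9.70 | 0.0 | 0.0 | 0.0
4 | 2 | 2025-01-01 00:21:34 | 2025-01-01 00:25:06 | 3.0 | 0.66 | 1.0 | N | 244 | 116 | 2 | 5.8 | 1.0 | 0.5 | 0.00 | 0.0 | 1.0 | 8.30 | 0.0 | 0.0 | 0.0
5 | 2 | 2025-01-01 00:48:24 | 2025-01-01 01:08:26 | 2.0 | 2.63 | 1.0 | N | 239 | 68 | 2 | 19.1 | 1.0 | 0.5 | 0.00 | 0.0 | 1.0 | 24.10 | 2.5 | 0.0 | 0.0
6 | 1 | 2025-01-01 00:14:47 | 2025-01-01 00:16:15 | 0.0 | 0.40 | 1.0 | N | 170 | 170 | 1 | 4.4 | 3.5 | 0.5 | 2.35 | 0.0 | 1.0 | 11.75 | 2.5 | 0.0 | 0.0
7 | 1 | 2025-01-01 00:39:27 | 2025-01-01 00:51:51 | 0.0 | 1.60 | 1.0 | N | 234 | 148 | 1 | 12.1 | 3.5 | 0.5 | 2.00 | 0.0 | 1.0 | 19.10 | 2.5 | 0.0 | 0.0
8 | 1 | 2025-01-01 00:53:43 | 2025-01-01 01:13:23 | 0.0 | 2.80 | 1.0 | N | 148 | 170 | 1 | 19.1 | 3.5 | 0.5 | 3.00 | 0.0 | 1.0 | 27.10 | 2.5 | 0.0 | 0.0
9 | 2 | 2025-01-01 00:00:02 | 2025-01-01 00:09:36 | 1.0 | 1.71 | 1.0 | N | 237 | 262 | 2 | 11.4 | 1.0 | 0.5 | 0.00 | 0.0 | 1.0 | 16.40 | 2.5 | 0.0 | 0.0

index | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | Airport_fee | cbd_congestion_fee
3475216 | 2 | 2025-01-31 23:58:20 | 2025-02-01 00:04:17 | NaN | 1.19 | NaN | NaN | 142 | 50 | 0 | 7.65 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 12.40 | NaN | NaN | 0.75
3475217 | 2 | 2025-01-31 23:38:25 | 2025-01-31 23:46:15 | NaN | 1.34 | NaN | NaN | 234 | 100 | 0 | -4.75 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 5.00 | NaN | NaN | 0.75
3475218 | 2 | 2025-01-31 23:26:03 | 2025-01-31 23:34:29 | NaN | 1.50 | NaN | NaN | 79 | 90 | 0 | 9.95 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 14.70 | NaN | NaN | 0.75
3475219 | 2 | 2025-01-31 23:21:00 | 2025-01-31 23:36:00 | NaN | 2.12 | NaN | NaN | 224 | 144 | 0 | 15.15 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 19.90 | NaN | NaN | 0.75
3475220 | 2 | 2025-01-31 23:26:31 | 2025-01-31 23:40:04 | NaN | 1.85 | NaN | NaN | 90 | 144 | 0 | 13.40 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 18.15 | NaN | NaN | 0.75
3475221 | 2 | 2025-01-31 23:01:48 | 2025-01-31 23:16:29 | NaN | 3.35 | NaN | NaN | 79 | 237 | 0 | 15.85 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 20.60 | NaN | NaN | 0.75
3475222 | 2 | 2025-01-31 23:50:29 | 2025-02-01 00:17:27 | NaN | 8.73 | NaN | NaN | 161 | 116 | 0 | 28.14 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 32.89 | NaN | NaN | 0.75
3475223 | 2 | 2025-01-31 23:26:59 | 2025-01-31 23:43:01 | NaN | 2.64 | NaN | NaN | 144 | 246 | 0 | 14.91 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 19.66 | NaN | NaN | 0.75
3475224 | 2 | 2025-01-31 23:14:34 | 2025-01-31 23:34:52 | NaN | 3.16 | NaN | NaN | 142 | 107 | 0 | 17.55 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 22.30 | NaN | NaN | 0.75
3475225 | 2 | 2025-01-31 23:56:42 | 2025-02-01 00:07:27 | NaN | 2.29 | NaN | NaN | 237 | 238 | 0 | 12.09 | 0.0 | 0.5 | 0.0 | 0.0 | 1.0 | 16.09 | NaN | NaN | 0.00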